Abstract

Systematic relationships among 12 population groups of five European species of house mice were
studied using multivariate morphometric methods. Multiple group principal component analysis
(MGPCA) was used to assess the contribution of the size component to the total variation. It was
shown that part of the ‘shape’ information may be resident in the first principal component and,
likewise, subsequent components may contain residual ‘size’ information. Hence, removing the
‘size-vector’ should be done with caution and after an appropriate examination of the data.
Canonical variate analysis (CVA) revealed similar results both on ‘size-in’ and ‘size-out’ MGPCA
scores. The first canonical variate discriminated between the aboriginal and commensal mice
lineages, while the second axis identified species clusters. The third canonical variate separated
groups of populations within commensal species. Both CVA and cluster analysis demonstrated that
(i) M. macedonicus and M. spretus are morphologically more similar to each other than either
species is to M. spicilegus; (ii) the distance between M. musculus and M. domesticus is similar to
distances among aboriginal (= outdoor) species; (iii) interpopulation distance is relatively high
compared to interspecific relationships.


Introduction 

For many years, the systematics of house mice of the genus Mus has been an intricate
puzzle. In 1943, Schwarz and Schwarz tried to simplify the taxonomy by condensing
more than 130 known scientific names of Mus into a single species, Mus musculus. They
recognized 15 subspecies and proposed the evolutionary scenario of a multiple origin of
commensal mouse taxa from exoanthropic forms. This approach was later followed
by other authors (Ellerman and Morrison-Scott 1951; Serafinski 1965; Corbet 1978;
Reichstein 1978). However, the concept oversimplified hierarchical relationships among
house mice, ignoring such phenomena as the absence of interbreeding between some taxa.

The advent of biochemical and molecular studies on free-living small mammals some
15 years ago shed light on the systematic interrelationships and evolutionary history of
the house mouse complex (see Boursot et al. 1993; Sage et al. 1993, for recent reviews),
falsifying Schwarz and Schwarz’s (1943) concept. It has been shown that there are five
taxa of house mice in Europe representing two major lineages (Marshall and Sage 1981;
Thaler et al. 1981; Bonhomme et al. 1984): one lineage consists of three aboriginal (Sage
1981), ‘outdoor’, species (M. spretus, M. spicilegus, M. macedonicus), while the other
includes commensal, ‘indoor’, taxa (M. domesticus, M. musculus). Note that the latter two
may be regarded as subspecies of a single species, M. musculus, by some authors (e. g.
Bonhomme and Guenet 1989; Auffray et al. 1990 a). Following Marshall (1981), Ferris
et al. (1983), and Sage et al. (1993) and the arguments given therein, however, all the
European house mouse taxa are treated as distinct species throughout this study.

Unequivocal genetic identification of investigated animals was used to establish
morphological discrimination criteria between some of the mouse taxa (Darviche and Orsini
1982; Orsini et al. 1983; Kraft 1985; Kratochvil 1986 a, b; Lyalyukhina et al. 1991). An
array of studies dealing with the morphometrics of house mice has been published in the last
two decades; however, these studies were limited in the number of variables involved (i. e.
uni- or bivariate analyses) and/or in the number of taxa studied (Sans-Coma et al. 1979;
Darviche and Orsini 1982; Orsini et al. 1983; Palomo et al. 1983; Lyalyukhina et al.
1991). Several studies employed multivariate methods, yet they were either focused on
one or a few species (Thorpe et al. 1982; Davis 1983; Scriven and Bauchau 1992) or did
not take into account the relative contribution of size and shape to the total variation
observed (Engels 1980, 1983; Gerasimov et al. 1990; Lavrenchenko 1994).

In a previous study (Macholan 1996), morphometric and morphological relationships
among populations of all five mouse species were evaluated and the taxonomic status
of central European mouse populations was documented. This analysis indicated that
the assessment was obscured by the ‘size’ component, which was, to some extent,
independent of the age structure of a population (see also Thorpe and Leamy 1983);
therefore, size-adjusting of the data was suggested. Nevertheless, as size and shape are
assumed to be essentially multivariate concepts (Humphries et al. 1981; Thorpe and
Leamy 1983), because one measurement cannot encompass the various facets of length,
width, etc., appropriate multivariate statistics, including removal of the size influence,
should be employed.

This study focuses on the multivariate analysis of morphometric relationships among 
European house mouse populations, including the relative importance of ‘size’ and 
‘shape’ components in their morphological differentiation. 


Material and methods 

Mouse skulls used in this study are deposited in collections of the Institute of Landscape Ecology in 
Brno, National Museum in Prague, Museum of Natural History in Vienna, Institute of Zoology in Kiev, 
Charles University in Prague, University of Lausanne and University of Montpellier. 

A total of 297 skulls of five house mouse species was analysed. The material was pooled into
13 groups (populations): ‘GR’ (Mus macedonicus, Greece, n = 25); ‘AUT’ (M. spicilegus, Austria,
n = 20); ‘UKR’ (M. spicilegus, Ukraine and Moldavia, n = 23); ‘SPR’ (M. spretus, France, Spain,
Morocco, n = 24); ‘DA’ (M. domesticus, Albania, n = 10); ‘DCH’ (M. d., Switzerland, n = 24); ‘DWM’
(M. d., western Mediterranean islands, n = 20); ‘MC’ (M. musculus, Bohemia, n = 25); ‘MM’ (M. m.,
Moravia, n = 25); ‘MS’ (M. m., Slovakia, n = 25); ‘MH’ (M. m., Hungary, n = 25); ‘MU’ (M. m.,
Ukraine, n = 25); ‘MSP’ (M. domesticus/musculus hybrids, W Bohemia, n = 25). A detailed description
of specific localities is given elsewhere (Macholan 1996), with the only exception of the
M. spicilegus population from Ukraine and Moldavia, which was added to the original material. This
sample consisted of mice from the Chernomorskiy zapovednik Reserve (n = 4), Tyaginka, Cherson
(n = 6), Golo-Pristan’sk (n = 3), Kirovograd (n = 3), Melovsk (n = 3), Kishinev (n = 4), and
Nikolaev (n = 2).

This study is based on 18 cranial and dental variables (Macholan 1996); namely, width of the upper
ramus of the zygomatic process of maxilla (A); width of the zygomatic process of maxilla (B);
condylobasal length (LCb); basal length (LB); rostral width (LaR); width of the skull per bullae
(LaC); zygomatic width (LaZ); height of the braincase (hC); length of the diastema (LD); length of
the first lower molar (LM1i); M₁ width (LaM1i); M₂ length (LM2i); M₃ length (LM3i); M₃ width
(LaM3i); length of the lower toothrow (LM13i); length of the first upper molar (LM1s); M² length
(LM2s); length of the upper toothrow (LM13s).

Only adult individuals were measured. In order to estimate the age of the animals under study, the
condition of their reproductive organs, their weight (Laurie 1946; Pelikan 1981), and the level of
abrasion of their molars (Keller 1974) were assessed; where possible, a combination of the three
approaches was taken into account (see Macholan 1996, for details). All individuals of doubtful age
were excluded from subsequent investigations.

Multiple group principal component analysis (MGPCA, Thorpe 1983 a), based on a pooled within-group
variance-covariance matrix, was used in order to assess the contribution of within-group
components to the between-group discrimination. The ‘size’ vector was sought so that it could
subsequently be extracted from the data. This ‘size’ component is generally the eigenvector
corresponding to the first principal component; however, three conditions have to be met for such
an assumption: an eigenvector expressing general size should have coefficients of the same sign (1)
and ‘similar’ magnitude (2); and the first principal components within localities should have the
same orientation (3). The latter condition can be tested by comparing the first eigenvectors across
localities.
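
As an illustration of these conditions, the following Python sketch computes a pooled within-group covariance matrix, extracts its eigenvectors, and checks the sign condition and the orientation of the first eigenvector within each group. It is not the SYSTAT or NTSYS-pc routine used in the study; it assumes NumPy is available, and all function names and the toy data are hypothetical.

```python
import numpy as np

def pooled_within_cov(X, groups):
    """Pooled within-group covariance of X (n x p) for the given group labels."""
    X = np.asarray(X, dtype=float)
    groups = np.asarray(groups)
    labels = np.unique(groups)
    W = np.zeros((X.shape[1], X.shape[1]))
    for g in labels:
        centred = X[groups == g] - X[groups == g].mean(axis=0)
        W += centred.T @ centred
    return W / (X.shape[0] - len(labels))

def mgpca(X, groups):
    """Eigenvalues (descending) and unit-length eigenvectors (columns)."""
    eigvals, eigvecs = np.linalg.eigh(pooled_within_cov(X, groups))
    order = np.argsort(eigvals)[::-1]
    return eigvals[order], eigvecs[:, order]

def same_sign(v):
    """Condition (1): all coefficients share one sign."""
    v = np.asarray(v)
    return bool(np.all(v > 0) or np.all(v < 0))

def first_eigenvector(X):
    """First principal axis of a single group (eigh sorts ascending)."""
    _, vecs = np.linalg.eigh(np.cov(X, rowvar=False))
    return vecs[:, -1]

# Hypothetical toy data: two groups, three characters driven by one 'size' factor.
rng = np.random.default_rng(0)
size = rng.normal(size=(40, 1))
X = size @ np.array([[3.0, 2.0, 1.0]]) + 0.1 * rng.normal(size=(40, 3))
X[20:] += 5.0  # group means differ, but they do not enter the pooled matrix
groups = np.array([0] * 20 + [1] * 20)

eigvals, eigvecs = mgpca(X, groups)
first = eigvecs[:, 0]
print("same sign:", same_sign(first))
# Condition (3): |cosine| between each group's first axis and the pooled one.
for g in (0, 1):
    print(g, abs(first_eigenvector(X[groups == g]) @ first))
```

Condition (2), ‘similar’ magnitude, has no sharp threshold and is best judged by inspecting the coefficients directly.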

Since substantial between-character differences due to the different scales of individual variables
were expected, the contribution of each character to a component was compared by computing the
pooled within-population correlation between the character and the component score according to
the formula (Thorpe 1983 a):

$$ r_{ij} = \frac{a_{ij}\sqrt{\lambda_j}}{s_i} $$

where $r_{ij}$ is the pooled within-group (within-population) correlation between the ith character
and the jth eigenvector, $a_{ij}$ is the coefficient for the ith character for the normalized jth
eigenvector, $\lambda_j$ is the latent root (eigenvalue) of the jth eigenvector and $s_i$ is the
pooled within-group standard deviation of the ith character. Before computing the correlations, the
eigenvectors were normalized so that each component coefficient was divided by
$\sqrt{\sum_{i=1}^{k} a_{ij}^2}$, where $a_{ij}$ is as defined above and k is the number of
characters.
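
The character-component correlation can be sketched in a few lines of Python. This is only an illustration of the formula above under the stated definitions; the function name and the two-character example (a covariance matrix with variances 2 and covariance 1, whose first eigenvalue is 3) are hypothetical.

```python
import math

def character_component_correlations(coeffs, eigenvalue, pooled_sd):
    """r_ij = a_ij * sqrt(lambda_j) / s_i for one eigenvector j.

    coeffs     -- eigenvector coefficients (normalized here to unit length)
    eigenvalue -- latent root of that eigenvector
    pooled_sd  -- pooled within-group standard deviation of each character
    """
    norm = math.sqrt(sum(a * a for a in coeffs))
    return [(a / norm) * math.sqrt(eigenvalue) / s
            for a, s in zip(coeffs, pooled_sd)]

# Covariance matrix [[2, 1], [1, 2]]: first eigenvalue 3, eigenvector (1, 1).
corr = character_component_correlations(
    [1.0, 1.0], 3.0, [math.sqrt(2.0), math.sqrt(2.0)]
)
print(corr)  # both values equal sqrt(3)/2
```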

Correlation coefficients were then compared and their significance was tested (Sokal and Rohlf
1981; Thorpe and Leamy 1983). Since for $\rho$ close to ±1.0 the distribution of sample values of r
is markedly asymmetrical, r has to be transformed to a function z; the standard normal deviate
$t_s$ is then defined as $z/\sigma_z$, where

$$ z = \frac{1}{2}\ln\frac{1+r}{1-r}, \qquad \sigma_z = \frac{1}{\sqrt{\sum_{i=1}^{k}(n_i - 3)}}; $$

r is the correlation coefficient as defined above, k is the number of populations, and $n_i$ is the
sample size of the ith population. Since z is approximately normally distributed and we are using a
parametric standard deviation, $t_s$ is compared with $t_{\alpha[\infty]}$ (where $\alpha$ = 0.01).
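
A minimal sketch of this significance test is given below. It assumes that the critical value is the two-tailed standard normal value for α = 0.01 (2.576) and, purely for illustration, uses the 13 group sizes listed in Material and methods; the function name is hypothetical.

```python
import math

def fisher_z_test(r, group_sizes, critical=2.576):
    """Return (z, t_s, significant) for a pooled within-group correlation r.

    critical=2.576 is an assumed two-tailed normal critical value (alpha = 0.01).
    """
    z = 0.5 * math.log((1.0 + r) / (1.0 - r))
    sigma_z = 1.0 / math.sqrt(sum(n - 3 for n in group_sizes))
    t_s = z / sigma_z
    return z, t_s, abs(t_s) > critical

# Group sizes as listed in Material and methods (illustrative use only).
sizes = [25, 20, 23, 24, 10, 24, 20, 25, 25, 25, 25, 25, 25]

for r in (0.10, 0.16):
    z, t_s, significant = fisher_z_test(r, sizes)
    print(f"r = {r:.2f}  z = {z:.4f}  t_s = {t_s:.3f}  significant: {significant}")
```

Under these assumptions r = 0.10 falls below the critical value and r = 0.16 just above it, which agrees with how those two values are marked in the RAW column of Table 1.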

Two techniques were employed in order to extract the size vector: one produces new data as
principal component scores with the first eigenvector removed (Thorpe 1983 b), whereas the other is
based on Burnaby’s (1966) adjustment as suggested in Rohlf and Bookstein (1988).
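
The second technique can be sketched as a projection of the data onto the subspace orthogonal to the size vector, which is the usual textbook form of Burnaby's adjustment; whether the software used here implemented it in exactly this way is an assumption. NumPy is assumed, and the function name and toy values are hypothetical. The first technique simply amounts to dropping the first column of the component scores.

```python
import numpy as np

def burnaby_adjust(X, size_vector):
    """Project rows of X onto the subspace orthogonal to size_vector.

    Computes X (I - v (v'v)^-1 v'), so variation along v is removed.
    """
    v = np.asarray(size_vector, dtype=float).reshape(-1, 1)
    projector = np.eye(v.shape[0]) - (v @ v.T) / (v.T @ v)
    return np.asarray(X, dtype=float) @ projector

# Hypothetical two-character example with an isometric size vector (1, 1).
X = np.array([[2.0, 2.0], [4.0, 4.0], [1.0, 3.0]])
v = np.array([1.0, 1.0])
adjusted = burnaby_adjust(X, v)
print(adjusted)      # rows differing only in size collapse to the origin
print(adjusted @ v)  # all zeros: nothing is left along the size vector
```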

Population interrelationships were assessed by subjecting the component scores to canonical variate
analysis (CVA, Fisher 1936). This multivariate ordination method separates groups so that
between-group variation is maximized while within-group variation is minimized (Campbell and
Atchley 1981). As multiple-group PCA uses pooled within-group covariances, CVA performed on all of
the MGPCA component scores (‘size-in’ analysis) gives the same results as CVA on the original data
(Thorpe et al. 1982; Thorpe 1983 a). CVA computed on MGPCA component scores with the ‘size’ vector
extracted (‘size-out’ analysis) revealed the same results as CVA performed on Burnaby-adjusted data.
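
The equivalence of CVA on the original data and on a full set of rotated scores can be checked numerically. The sketch below is a generic CVA written with NumPy, not the SYSTAT implementation; the scaling of the between-group matrix is one of several conventions, and the function name and simulated data are hypothetical.

```python
import numpy as np

def cva(X, groups):
    """Canonical variate analysis: eigenvalues, vectors and scores."""
    X = np.asarray(X, dtype=float)
    groups = np.asarray(groups)
    labels = np.unique(groups)
    n, p = X.shape
    grand_mean = X.mean(axis=0)
    W = np.zeros((p, p))
    B = np.zeros((p, p))
    for g in labels:
        Xg = X[groups == g]
        mean_g = Xg.mean(axis=0)
        centred = Xg - mean_g
        W += centred.T @ centred
        d = (mean_g - grand_mean).reshape(-1, 1)
        B += Xg.shape[0] * (d @ d.T)
    W /= n - len(labels)          # pooled within-group covariance
    B /= len(labels) - 1          # between-group covariance (one convention)
    # Whiten with the Cholesky factor of W, then use a symmetric eigen-solver.
    L_inv = np.linalg.inv(np.linalg.cholesky(W))
    eigvals, U = np.linalg.eigh(L_inv @ B @ L_inv.T)
    order = np.argsort(eigvals)[::-1]
    vectors = L_inv.T @ U[:, order]   # scaled to unit within-group variance
    return eigvals[order], vectors, (X - grand_mean) @ vectors

# Hypothetical data: three groups of 30 specimens, three characters.
rng = np.random.default_rng(1)
means = np.array([[0.0, 0.0, 0.0], [4.0, 0.0, 0.0], [8.0, 1.0, 0.0]])
groups = np.repeat([0, 1, 2], 30)
X = means[groups] + rng.normal(size=(90, 3))

eig_raw, vec_raw, scores_raw = cva(X, groups)
Q, _ = np.linalg.qr(rng.normal(size=(3, 3)))   # arbitrary orthogonal rotation
eig_rot, _, _ = cva(X @ Q, groups)
print(eig_raw)
print(eig_rot)  # same canonical roots after rotating the data
```

Because a full set of MGPCA scores is an orthogonal rotation of the data, the canonical roots, and hence the distances among groups, are unchanged.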

Matrices of Mahalanobis generalized distances D², computed as a part of canonical variate analysis,
were used both in the Mantel test comparing the results of the ‘size-in’ and ‘size-out’
multivariate analyses and as input to cluster analysis.
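
The comparison of two distance matrices can be sketched as follows: a product-moment correlation between corresponding off-diagonal elements, with a simple permutation test. This uses only the Python standard library and is not the NTSYS-pc procedure; the function names, the number of permutations and the toy distances are hypothetical.

```python
import math
import random

def lower_triangle(D):
    """Off-diagonal elements below the main diagonal, row by row."""
    return [D[i][j] for i in range(len(D)) for j in range(i)]

def pearson(x, y):
    mx, my = sum(x) / len(x), sum(y) / len(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

def mantel(D1, D2, permutations=999, seed=0):
    """Matrix correlation and one-tailed permutation p-value."""
    observed = pearson(lower_triangle(D1), lower_triangle(D2))
    rng = random.Random(seed)
    n = len(D1)
    hits = 1  # the observed arrangement counts as one permutation
    for _ in range(permutations):
        perm = list(range(n))
        rng.shuffle(perm)
        permuted = [[D1[perm[i]][perm[j]] for j in range(n)] for i in range(n)]
        if pearson(lower_triangle(permuted), lower_triangle(D2)) >= observed:
            hits += 1
    return observed, hits / (permutations + 1)

# Hypothetical distances among four groups; D2 is D1 doubled.
D1 = [[0, 1, 3, 7], [1, 0, 2, 6], [3, 2, 0, 4], [7, 6, 4, 0]]
D2 = [[2 * d for d in row] for row in D1]
r_obs, p_value = mantel(D1, D2, permutations=99)
print(r_obs, p_value)
```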

The System for Statistics (SYSTAT, Release 5.02, Wilkinson 1990) and Numerical Taxonomy System
(NTSYS-pc, Version 1.60, Rohlf 1990) packages were used for all the statistical analyses.


Results 

As stated above, there are three assumptions for the first principal component, extracted
from the pooled within-population covariance matrix, to be treated as the ‘size’ vector.
The first principal component within localities appeared to be of the same orientation, as
substantiated by checking the signs of the first eigenvectors for each locality. Furthermore,
the coefficients corresponding to the first principal axis were all of the same sign
(see the first column in Tab. 1). However, differences in their magnitude were strikingly
high. Since the (pooled) variance-covariance matrix was used, the relative magnitude of
the coefficients was dependent on the variances of the original data, i. e., on the scale of
the respective characters. Therefore, two transformation techniques were used in order to
decrease the differences in the variances: firstly, the variates were converted to logarithms,
and secondly, the data were normalized by subtracting the mean and dividing by
the standard deviation as provided by the standard SYSTAT routine.
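
The effect of the two transformations on character variances can be illustrated with a short standard-library sketch. The measurements below are invented, and the use of the sample standard deviation (n − 1 in the denominator) is an assumption about the SYSTAT routine.

```python
import math
import statistics

def log_transform(values):
    """Natural logarithms of the raw measurements."""
    return [math.log(v) for v in values]

def normalize(values):
    """Subtract the mean and divide by the sample standard deviation."""
    mean = statistics.mean(values)
    sd = statistics.stdev(values)  # assumption: n - 1 in the denominator
    return [(v - mean) / sd for v in values]

# Invented values (mm): a long skull measurement and a short molar measurement.
skull = [20.0, 21.0, 22.0]
molar = [1.50, 1.55, 1.60]

for name, transform in (("raw", list), ("log", log_transform), ("norm", normalize)):
    ratio = statistics.variance(transform(skull)) / statistics.variance(transform(molar))
    print(f"{name:4s} variance ratio skull/molar = {ratio:.2f}")
```

In this toy case the logarithms shrink the variance ratio considerably but do not bring it to 1, whereas normalization equalizes the variances by construction.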

Either transformation of data is only feasible under the expectation of a general improvement
in linearity. Since non-linear relations between variables would result in a lower
inter-character correlation, an improvement in linearity should generally be apparent
by a larger first eigenvalue in the correlation matrix. As shown in table 1, both the
transformations yielded a slight (although insignificant) decrease in curvilinearity between the
variables. If we compare the total variance explained by the first principal component for
the three data sets, we can see a steady decrease in the percentage from the original to the
normalized data sets.

Although logging the variates reduced the differences among individual component
coefficients of the first PC, their magnitude still remained highly heterogeneous. Moreover,
whereas six characters showed insignificant character-component correlations in the
raw data, this number was increased to as many as nine in log-transformed characters (all
of them being the tooth measures). Hence, it is obvious that log-transforming data may


Table 1. Principal component coefficients of the first normalized eigenvector of MGPCA (left
columns) and character-component correlation coefficients (right columns) for raw (RAW),
log-transformed (LOG) and normalized (NORM) data. Nonsignificant correlations are in parentheses.
Below, the percentage of the total variance explained by the first principal component, the
variance explained by the first three components (all the eigenvalues being extracted from the
covariance matrix, V), and the proportion of the first eigenvalue computed from the correlation
matrix (C), respectively, are given.

Character           RAW                 LOG                 NORM
                    Coeff.   Corr.      Coeff.   Corr.      Coeff.   Corr.
A                   0.033    0.47       0.774    0.89       0.194    0.50
B                   0.038    0.50       0.392    0.66       0.169    0.54
LCb                 0.652    1.00       0.136    0.60       0.397    0.90
LB                  0.632    1.00       0.150    0.62       0.411    0.90
LaR                 0.097    0.59       0.134    0.48       0.312    0.70
LaC                 0.128    0.57       0.062    0.39       0.313    0.70
LaZ                 0.296    0.80       0.130    0.56       0.327    0.80
hC                  0.084    0.38       0.056    0.25       0.224    0.52
LD                  0.229    0.88       0.161    0.50       0.341    0.78
LM1i                0.004    (0.10)     0.013    (0.07)     0.119    0.32
LaM1i               0.007    0.23       0.025    (0.11)     0.149    0.42
LM2i                0.005    (0.13)     0.016    (0.05)     0.137    0.35
LM3i                0.002    (0.06)     0.013    (0.03)     0.082    0.25
LaM3i               0.002    (0.06)     0.013    (0.04)     0.128    0.30
LM13i               0.013    0.16       0.014    (0.07)     0.132    0.43
LM1s                0.000    (0.01)     0.012    (0.04)     0.084    0.22
LM2s                0.004    (0.09)     0.027    (0.08)     0.117    0.27
LM13s               0.017    0.18       0.021    (0.11)     0.151    0.43

1st V eigenvalue    81.09%              42.24%              33.80%
3 V eigenvalues     91.45%              68.02%              61.21%
1st C eigenvalue    31.32%              32.73%              31.61%

not adequately standardize the variance of the characters. In addition, it is apparent from
table 1 that this transformation substantially changed the relative contribution of individual
characters to the total variation explained by the component (cf. the coefficients and
correlations in the RAW and LOG columns of Tab. 1).

On the contrary, MGPCA computed on the normalized data revealed all the correlations
to be significant, although the magnitude of the coefficients was neither the same nor
‘similar’. These results suggest that although the differences in the magnitude of the
coefficients of the first principal component were partly due to differences in character
variances, there was still some portion of persistent variance which could not be associated
with the ‘size’ component: this especially concerned the dental measures. Therefore, the
size-adjustment must be used with caution, as these results indicate that the first principal
component is also very likely to contain some ‘shape’ information which would be lost on

Fig. 1. Plots of the first two canonical variate scores for group centroids; (a), (b) ‘size-in’
analysis; (c), (d) ‘size-out’ analysis; plots (a) and (c) show results of the log-transformed data,
whereas (b) and (d) concern the normalized data. Minimum spanning trees were superimposed on the
plots.


extraction of the ‘size’ vector; on the other hand, we may expect a small part of residual
size information to be contained in following (intentionally ‘size-free’) components. Thus,
results of both the ‘size-in’ and ‘size-out’ analyses should be taken into account (accordingly,
these terms are rather loose and henceforth will only be used for convenience).

Figure 1 shows plots of scores for the first two canonical variates. For log-transformed
and normalized data (Fig. 1 a, c vs. b, d), similar results were revealed. The first variate
(CV1) apparently separated the two major lineages, i.e. aboriginal and commensal species
groups, while the second one (CV2) identified individual species (or groupings of populations)
within the lineages. According to relative values of discriminant coefficients,
the contrast of A and B (the so-called zygomatic index), and, to a lesser degree, also LM13i
and LaZ were the variables contributing the most to the first discriminant function. In
contrast, in the second canonical variate, the relative contribution was not so clear, with the
highest coefficients being those for LB (contrasted by LCb), LM1s, LD and B.

As displayed by the minimum spanning tree, the two M. spicilegus populations formed
the most remote group within the aboriginal lineage in the two-dimensional discriminant
space, whilst M. spretus was closer to M. macedonicus. When ‘size-in’ and ‘size-out’ analyses
were compared, the patterns were similar except for the changed relative position of
the Ukrainian and Austrian spicilegus samples, and the Albanian M. domesticus population,
which tended to be closer to M. musculus populations in both the ‘size-in’ analyses
contrary to the ‘size-out’ ones.



Fig. 2. A three-dimensional plot of the first three canonical variate scores for the ‘size-out’ CVA on the 
normalized data. Group centroids are connected by the minimum spanning tree. 

A three-dimensional plot of the first three canonical variates is presented in figure 2
for the ‘size-out’ CVA on normalized data. The third canonical axis, based mostly on the
relative rostral width (LaR as compared to LCb), placed the Albanian mice into the
domesticus cluster and separated Bohemian and Moravian M. musculus populations from
the Slovakian, Hungarian, and Ukrainian ones. Within the aboriginal group the species
were not distantly separated by CV3 with, again, M. spretus being between the two eastern
short-tailed outdoor species. There were no substantial differences between the ‘size-in’
and ‘size-out’ and between logged and normalized data analyses on the third canonical
axis.
The correctness of canonical discriminations was assessed by a posteriori classification
tests. In table 2, actual memberships (rows) were tabulated against predicted ones
(columns) from ‘size-out’ CVA on log-transformed data, where populations were pooled
within species except the hybrid (MSP) and DA samples. In this analysis, 84.12% of cases
were classified correctly. When the MSP sample was excluded, the correctness increased
to 91.51-92.25% (Cohen’s Kappa ranging between 0.884 and 0.895) depending on the
type of the analysis used (see Tab. 3).
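
For readers unfamiliar with Cohen's Kappa, the following sketch shows how the proportion of correct assignments and Kappa are obtained from a classification table with actual groups in rows and predicted groups in columns. The function name and the small 2 x 2 table are hypothetical and are not taken from the data of this study.

```python
def cohen_kappa(table):
    """Return (proportion correct, Cohen's kappa) for a square table."""
    n = sum(sum(row) for row in table)
    observed = sum(table[i][i] for i in range(len(table))) / n
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    # Agreement expected by chance from the marginal totals.
    expected = sum(r * c for r, c in zip(row_totals, col_totals)) / n ** 2
    return observed, (observed - expected) / (1.0 - expected)

# Hypothetical table: rows are actual groups, columns are predicted groups.
toy = [[20, 5],
       [10, 15]]
accuracy, kappa = cohen_kappa(toy)
print(accuracy, kappa)
```

Kappa discounts the agreement expected from the marginal totals alone, which is why it is lower than the raw proportion of correct assignments.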


Table 2. A posteriori classification testing the correctness of the assignment of each individual
to a particular group for the ‘size-in’ CVA performed on log-transformed data. Here, actual group
membership (rows) is tabulated against predicted (columns). The populations are pooled within
species except for DA and MSP samples. T, totals.

        MAC    SPI    SPR    DA     DOM    MUS    MSP    T
MAC     24     0      0      0      1      0      0      25
SPI     0      42     0      0      1      0      0      43
SPR     1      1      22     0      0      0      0      24
DA      0      0      0      10     0      0      0      10
DOM     0      0      0      4      37     1      2      44
MUS     0      0      0      5      7      94     18     124
MSP     0      0      0      0      4      3      19     26
T       25     43     22     19     50     98     39     296



Table 3. A comparison of the correctness of canonical discriminations on different data sets; LOGIN,
‘size-in’ CVA on logged data; LOGOUT, ‘size-out’ CVA on logged data; NORMIN, ‘size-in’ CVA on
normalized data; NORMOUT, ‘size-out’ CVA on normalized data. In columns, percentages of erroneous
assignment for each species (plus DA sample, MSP excluded), total classification error (in %),
and Cohen’s Kappa are given, respectively. Cohen’s Kappa is an association measure testing if counts
along the diagonal in Table 2 are significantly greater than those expected by chance alone; values
greater than 0.75 are usually said to indicate strong agreement (Wilkinson 1990).

           MAC     SPI     SPR     DA      DOM     MUS     Total   Kappa
LOGIN      4.00    2.33    8.33    0.00    11.36   9.68    7.78    0.894
LOGOUT     4.00    0.00    8.33    0.00    11.36   10.48   7.78    0.894
NORMIN     4.00    0.00    8.33    0.00    9.09    11.20   7.75    0.895
NORMOUT    8.00    0.00    8.33    0.00    13.64   10.40   8.49    0.884



The ‘size-in’ and ‘size-out’ analyses were compared by plotting the distance matrices
against each other using the Mantel procedure (NTSYS-pc) separately for each data set
(Fig. 3). In both cases, CVA revealed similar results, with distances lying close to the
diagonal (product-moment correlations r = 0.998 and r = 0.991, respectively). However, there
were some differences between size-adjusted and non-adjusted data, especially in the
normalized variables, mainly due to M. macedonicus, which tended to show higher distances
in the ‘size-in’ CVA compared to the ‘size-out’ analyses.

Results of a UPGMA cluster analysis based on the Mahalanobis distances are
shown in figure 4. Because of the hybrid nature of the MSP population, this sample was
excluded from the clustering. Interestingly, the ‘size-in’ and ‘size-out’ procedures gave the
same trees, while there was a difference between log-transformed and normalized data
sets: the Swiss M. domesticus (DCH) appeared in the M. musculus MC + MM cluster with
the logged data (Fig. 4 a), whereas both the species made separate clusters






M. Macholan 



Fig. 3. Mantel plots of ‘size-in’ (ordinate) and ‘size-out’ (abscissa) Mahalanobis distances
between group centroids; (a) log-transformed data, r = 0.998; (b) normalized data, r = 0.991. In
both cases, M. macedonicus is marked by triangles, while all other populations are indicated by
circles.

Fig. 4. UPGMA dendrograms based on the Mahalanobis distances; (a) log-transformed data; (b)
normalized data. Both the ‘size-in’ and ‘size-out’ analyses gave the same results for each type of
data transformation; the MSP hybrid sample was excluded from the clustering.

(([MC + MM] + [[MS + MU] + MH]) and (DA + [DCH + DWM]), respectively) with the
normalized data (Fig. 4 b). Albanian mice, although being quite distantly related,
clustered with M. domesticus in both cases. What should be noted in the dendrograms is the
small distance between M. macedonicus and M. spretus in comparison to the distance
between both the M. spicilegus populations and among populations within the domesticus
and musculus clusters. The mound-building mouse (M. spicilegus) thus appeared to be
morphologically the most distant species within aboriginal mice, yet interspecific
morphological distances within this group were, in general, strikingly small.
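
For reference, the UPGMA procedure itself is simple enough to sketch with the standard library: at each step the two closest clusters are merged and distances to the new cluster are averaged with weights equal to cluster sizes. This is a generic illustration, not the NTSYS-pc routine; the function name, group labels and distances are hypothetical.

```python
def upgma(D, labels):
    """Return the list of merges (cluster_a, cluster_b, distance)."""
    clusters = {i: (labels[i], 1) for i in range(len(labels))}
    # Distances keyed by (larger id, smaller id).
    dist = {(i, j): float(D[i][j]) for i in range(len(D)) for j in range(i)}
    merges = []
    next_id = len(labels)
    while len(clusters) > 1:
        (a, b), d = min(dist.items(), key=lambda item: item[1])
        name_a, size_a = clusters.pop(a)
        name_b, size_b = clusters.pop(b)
        merges.append((name_a, name_b, d))
        updated = {}
        for c in clusters:
            d_ac = dist[(max(a, c), min(a, c))]
            d_bc = dist[(max(b, c), min(b, c))]
            # Size-weighted average gives the unweighted pair-group mean.
            updated[(next_id, c)] = (size_a * d_ac + size_b * d_bc) / (size_a + size_b)
        dist = {k: v for k, v in dist.items() if a not in k and b not in k}
        dist.update(updated)
        clusters[next_id] = (f"({name_a} + {name_b})", size_a + size_b)
        next_id += 1
    return merges

# Hypothetical distances among three groups.
toy_labels = ["A", "B", "C"]
toy_D = [[0, 2, 6],
         [2, 0, 8],
         [6, 8, 0]]
merges = upgma(toy_D, toy_labels)
for step in merges:
    print(step)
```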


Discussion 

Although size can provide significant information on morphometric differences among
taxa, it is sometimes desirable to avoid size variation, as it may cause a substantial bias in
an assessment of group interrelationships due to growth allometry (Rohrs 1961;
Thorpe 1976, 1983 a). This especially concerns organisms with indeterminate growth; however,
nutritional, seasonal, sexual, ecological and other factors are also likely to affect
morphological characters (Leamy 1981) and thus the size-adjusting of data may be necessary.

Several techniques have been employed to remove the size component from the analysis.
In morphometrics, the most familiar methods are those using ratios; however, it has
been argued repeatedly that because of some undesirable statistical properties and
conceptual difficulties ratios should be avoided (Atchley et al. 1976; Corruccini 1977;
Atchley 1978; Atchley and Anderson 1978). Nor does taking logarithms of the ratios
(Blackith and Reyment 1971; Dodson 1978; Hills 1978) entirely remove size from data,
as stated by Humphries et al. (1981) and evidenced by Reist (1985). Another possibility
is a univariate regression analysis of variables on a standard size measurement such as
snout-vent length in reptiles and amphibians, standard length in fish, wing length in birds,
or condylobasal length in mammal skulls (Thorpe 1975; Corruccini 1977; Kuhry and
Marcus 1977). However, since size is designated a single variable in these techniques,
only one particular variable is partialed out. As pointed out by Humphries et al. (1981)
and Thorpe and Leamy (1983), size is not equal to any single measurement, and using a
multivariate ordination method for comparing size and shape differences among groups is
more appropriate.

Multiple group principal component analysis (MGPCA) is now widely used in various
types of studies (Thorpe 1983 b; Corti and Thorpe 1989; Allegrucci et al. 1992;
Bekele et al. 1993) as a method of evaluating the relative contribution of size and
shape to the between-group variation and to extract the size component from data.
Nevertheless, some criticisms have appeared concerning the biological meaning of the
general size aspect of the first principal component (Shea 1985), and/or pointing out
that the first component may contain shape information and remaining vectors may
retain size information (reviewed in Humphries et al. 1981; Reist 1985). While the first
criticism is irrelevant to this study, the second may pose a problem, because even
though the coefficients related to the first principal component appeared all to be of the
same sign, their magnitude was very different even after transformation. This may
indicate that some residual size component could be resident in the second and following axes,
whereas a proportion of shape information is likely to be lost with an extraction of the
first component.

Comparison of two transformation techniques employed for standardization of the
measurements with different scales (log-transformation and normalization) proved to be
of interest. While both approaches resulted in a slight improvement in between-variable
linearity, logging the data (probably the most widely used method in morphological
studies) tended to change the relative contribution of variables to the variation explained by
principal components. Moreover, the influence of non-equality of the variances of the
original characters, although being lowered relative to the raw data, was not entirely
restrained, and in turn the number of variables with insignificant character-component
correlations was even increased by 50% in the log-transformed vs. original data.
Normalization, on the other hand, gave better results both in the standardization of the
variances and in the relative contribution of individual variables.

CVA computed on the MGPCA scores gave similar results for both
the normalized and log-transformed variates and only slight differences between the
‘size-in’ and ‘size-out’ analyses. In all the cases, the two separate evolutionary and ecological
lineages were clearly discriminated. Within the aboriginal lineage, the most distant species
were M. macedonicus and M. spicilegus, with M. spretus being morphologically intermediate
between them. This result is rather surprising given the close genetic relationships
between the former two species (Bonhomme et al. 1983, 1984). Likewise, Gerasimov et al.
(1990), using stepwise discriminant analysis, found these forms to be morphologically
very similar. On the contrary, in a previous study focused mostly on uni- and bivariate
analyses (Macholan 1996), M. spicilegus appeared to be closer to M. spretus when original
untransformed data were analysed, whereas the latter showed greater similarity with
M. macedonicus when the variables were size-adjusted using Thorpe’s (1975) allometric
formula. A comparison of the univariate (Macholan 1996) and multivariate (this study)
analyses carried out on the same material shows the former to be more affected by the
growth/size influence than the latter. It is not clear, however, to what extent the close
similarity between M. macedonicus and M. spretus reflects the circum-Mediterranean
ecological vicariance of the two species (Auffray et al. 1990 b).

Within the commensal lineage, a somewhat peculiar position was displayed by the
M. domesticus sample from Albania (considered as M. d. brevirostris, Reichstein 1978;
Marshall 1981, but see the discussion about validity of subspecific categories in
M. domesticus in Ferris et al. 1983; Sage et al. 1986; Macholan 1996), which was rather
distinct both from other M. domesticus populations and from M. musculus, mainly due to
its relatively narrow rostrum (Macholan 1996). However, since only 10 Albanian individuals
were studied, more animals should be investigated and perhaps other measurements
should be included before the systematic relationships of Albanian and other commensal
house mice can be established.

In his multivariate morphometric analyses of house mice from eastern Europe and
central Asia, Lavrenchenko (1990, 1994) found the variation within M. musculus to be
categorical rather than clinal and this led him to distinguish three subspecies: musculus
from the European part of the former USSR, southern Siberia and eastwards to the Far
East; wagneri from lowlands north of the Caspian Sea, Kazakhstan, and ex-Soviet Central
Asia; and raddei from eastern Kazakhstan, Altai, most of Mongolia, and eastern
Transbaikalia.

In this study, the investigation of M. musculus populations from western parts of its
range showed quite different patterns and the variation changed rather continuously in
the east-west direction. This conclusion is corroborated by the results of the univariate
analysis (Macholan 1996), where some measurements were shown to be similar to
M. domesticus in western localities, especially when the raw variates were taken. This
suggests a possibility of introgression of domesticus alleles into the musculus range across
the hybrid zone in western Bohemia and south-eastern Germany (Sage et al. 1986;
Tucker et al. 1992; Macholan and Zima 1994). The introgression of polygenic traits is
similar to that of biochemical markers (Macholan and Zima 1994 and unpubl. results),
but the gene-flow distance might be much higher, as indicated by the results of the
present study.
Summary

Multivariate morphometric analysis of European species of the genus Mus (Mammalia, Muridae)

The systematic relationships among 12 population groups of five European house mouse species
were examined using various multivariate morphometric methods. Multiple group principal
component analysis (MGPCA) was used to assess the contribution of size to the total variation. It
was shown that part of the ‘shape’ information may be contained in the first principal component
and that, similarly, further components may contain residual ‘size’ information. Removing the
‘size’ information should therefore be done with caution and only after an appropriate examination
of the data. Canonical discriminant analysis yielded similar results for the ‘size-in’ and
‘size-out’ MGPCA analyses. The first canonical axis separated the outdoor and the commensal species
groups of mice, while the second axis identified species groups. The third canonical axis separated
groups of populations within the commensal species. Both CVA and cluster analysis showed that
(1) M. macedonicus and M. spretus are morphologically more similar to each other than either
species is to M. spicilegus; (2) the distance between M. musculus and M. domesticus is similar to
the distance among the outdoor species; and (3) the distance between populations is relatively
high compared with interspecific relationships.